Method for image processing by means of a neural network and method for training a neural network

ABSTRACT

A computer-implemented method for processing an image includes steps of dividing the image into at least two image regions, grouping at least one of the image regions into at least one group wherein each of the image regions is assigned to exactly one group or to no group, and applying at least one kernel in a layer of a neural network to the image regions of at least one group, wherein at most one kernel is applied to the image regions of each group. The disclosure further relates to a computer-implemented method for training a neural network, to a device for processing an image in a neural network, and to a computer-readable medium.

BACKGROUND

Technical Field

The present disclosure relates to a computer-implemented method for processing an image in a neural network.

Description of the Related Art

Neural networks, more precisely convolutional neural networks (CNNs), are known, for example, from “Object Recognition with Gradient-Based Learning” by Yann LeCun et al. In particular, finding an appropriate set of features is discussed therein. CNNs prove to be particularly well suited for this task.

Digital cameras are used in various industrial applications. The scene to be recorded is usually illuminated using an illumination unit and imaged on an image sensor of the camera with the aid of an objective. Any desired processing steps can then be applied to the resulting raw image, which can subsequently be transmitted via a corresponding interface.

Digital cameras are used, inter alia, in traffic engineering, for example for detecting license plates, for checking product quality in production, and in monitoring technology. Samples, e.g., cell samples, are frequently evaluated in medical technology and also in the field of biology. The investigation of the sample is usually carried out with the aid of a microscope on which a digital camera is mounted. The camera records the sample and subsequently provides the user with a digital image of the sample. This image can then be used for further investigations, such as counting or classifying cells.

There are also many further applications for digital cameras and images, e.g., in the areas of astronomy, medicine, physics, biology, chemistry, and art history, to name just a few.

In order to be able to accomplish the respective tasks, sufficient image quality is required. This means that the image content relevant to the task must be clearly recognizable in the image and must not be rendered unrecognizable by image noise or other effects.

In addition, it is important, especially in the field of medical technology, that no image contents are distorted by the correction method or individual operations. In microscopy, exposure times of a few milliseconds up to several minutes are common. The evaluation time of a processing method should accordingly be within a similar time frame.

The removal of image noise, so-called denoising, is an important method for improving the image quality and for preserving the image contents when processing image data.

The denoising can be calculated directly in the camera or in external hardware, for example in a computer, e.g., the customer's computer with a processor (CPU) or with a graphics processor (GPU). It is therefore desirable to make economical use of the resources as well as of the computing time in order to minimize the burden on the customer's computer and to save hardware resources. Furthermore, it is possible to carry out the calculation in the camera in a field-programmable gate array (FPGA) or in specific hardware, such as a frame grabber or microcontroller (MCU).

In this case, the demand for resources should also be kept to a minimum since computing power, energy, and chip area are always associated with costs.

Medical or biological samples are frequently very sensitive and can be damaged or destroyed by an excessive amount of light. Therefore, imaging recordings of such samples are frequently produced using small amounts of light, e.g., at low light intensity, as is often the case, for example, in microscopy. Under the mostly correct assumption that the recording is made with a camera that at least largely meets the EMVA 1288 standard, the result is a recording with a low signal-to-noise ratio. This can significantly impair the recognizability and usability of the image. It is therefore desirable to improve the signal-to-noise ratio by reducing noise in the image.

If a sample is exposed to a small amount of light during a recording, a low exposure of the image sensor generally results. Significant forms of image noise at low exposures are primarily photon noise, also called photon shot noise, and dark noise, which is dominated by the electrical readout noise, also called “read noise.”

In the case of a low level of light intensity, exposure can in principle be increased by increasing the exposure time. As a result, the photon noise can usually be reduced and the recognizability and usability of the image can thus be improved. However, an increase in the exposure time according to EMVA 1288 also leads to an increase in the dark noise and to an increase in the number of so-called hot pixels in the image.

So-called dark noise occurs without light impinging on the sensor of a camera. The reason for this noise is, on the one hand, the dark current of the individual light-sensitive elements, i.e., pixels, and, on the other hand, also the noise of the read-out amplifier. Dark noise occurs, for example, in fluorescence microscopy.

Hot pixels are pixels which do not react proportionally to the incident light, as a result of which, particularly in the case of long exposure times, they reproduce image values at isolated points that are clearly too bright (cf. EMVA 1288). The number of hot pixels rises primarily in the case of long exposure times and high ISO values or gain values. As the temperature increases, the number of hot pixels increases, and one method for avoiding hot pixels and image noise is therefore to keep the temperature of the camera low. Many digital cameras that are intended for long exposure times have cooling of the image sensor. Hot pixels are mainly caused by manufacturing inaccuracies and lead to further degradation of the image quality and in turn impair the recognizability and usability of the image.

A large number of “denoising” algorithms already exists; a first approach, however, requires a series of image recordings, as in J. Boulanger et al., “Patch-based non-local functional for denoising fluorescence microscopy image sequences,” 2010.

Another approach initially generates synthetic training data by adding defined noise, such as Gaussian noise, to a noise-free data set, e.g., J. Xie et al., “Image Denoising and Inpainting with Deep Neural Networks,” 2012. According to the definition of the EMVA 1288 standard, however, Gaussian noise does not represent a realistic noise model for digital cameras.

Furthermore, numerous deep learning approaches exist which preferably use CNNs with different network architectures, such as autoencoders, e.g., V. Jain et al., “Natural image denoising with convolutional networks,” 2008, and H. C. Burger, “Image denoising: Can plain neural networks compete with BM3D?” in IEEE Conference on Computer Vision and Pattern Recognition, 2012.

In principle, however, the known disadvantages of deep learning approaches are the amount of data required and the resources needed as well as time-consuming training. A further common disadvantage is the lack of generalizability of the networks. This means that the network generally works well for the trained samples but not for other types of samples.

In addition to deep learning approaches, further denoising methods exist which are based on so-called shallow learning or classical approaches, such as P. Milanfar, “Fast, Trainable, Multiscale Denoising,” 2018, S. Srisuk, “K-means based image denoising using bilateral filtering and total variation,” 2014, or J. Ehmann, “Real-time video denoising on mobile phones,” 2018.

The disadvantage of these methods is that various approaches, elements, or filters frequently have to be trained or developed separately at great expense and subsequently have to be combined.

BRIEF SUMMARY

Described herein is a correction method which both reduces the noise in digital images and at the same time does not distort the content of the image.

A method according to the disclosure for processing an image comprises the steps of dividing the image into at least two image regions, grouping at least one of the image regions into at least one group, wherein each of the image regions is assigned to exactly one group or to no group, and applying at least one kernel in a layer of a neural network to the image regions of at least one group, wherein at most one kernel is applied to the image regions of each group.

Furthermore proposed are a method for training a neural network as well as a device for image processing by means of a neural network and for training a neural network, and a computer-readable medium.

The kernels proposed here can in particular be convolution kernels, in which case the layer in which the kernels are applied to the image becomes a convolutional layer in the neural network. Strictly speaking, the fact that the kernel is a convolution kernel means that the kernel is flipped, i.e., the kernel matrix is mirrored both horizontally and vertically, which corresponds to a rotation by 180 degrees. Without this mirroring of the matrix, the operation is usually referred to as cross-correlation. However, since a cross-correlation is always assumed here within a neural network, the operation is frequently referred to as convolution even without mirroring of the matrix. The layer can therefore be referred to in any case as a convolutional layer.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

Advantages, features, and characteristics of the disclosure are explained by the following description of preferred embodiments with reference to the accompanying drawings, in which:

FIG. 1 shows a method according to an exemplary embodiment,

FIG. 2 shows a schematic exemplary image which is divided into image regions of different sizes,

FIG. 3A shows an application of kernels to an image according to the prior art,

FIG. 3B shows an application of kernels to an image according to the prior art with zero padding,

FIGS. 4A and 4B show an application of kernels to an image according to an exemplary embodiment,

FIG. 5 shows an image for training a neural network according to an exemplary embodiment,

FIG. 6 shows a method for training a neural network according to an exemplary embodiment,

FIG. 7 shows a device for processing an image in a neural network according to an exemplary embodiment, and

FIG. 8 shows a schematic, exemplary method of an application for denoising.

DETAILED DESCRIPTION

The same or similar reference signs are used in the following for the same or similar components or steps.

FIG. 1 shows a method 100 for processing an image according to an exemplary embodiment. In this case, an image is first divided into at least two image regions in step 110. The image regions may be the same size, e.g., 8×8 pixels, 16×16 pixels, 32×32 pixels. Alternatively, the image regions may also be of different sizes.

The image can, for example, be divided into the image regions by applying a decomposition algorithm or a segmentation algorithm. In this case, the algorithms used can operate in a pixel-oriented, edge-oriented, region-oriented, model-based, or texture-based manner.

Such algorithms are, for example, watershed transformation, threshold value methods, random walk, quadtree, octree, the Felzenszwalb-Huttenlocher algorithm, the Sobel operator, the Laplace operator, the live wire method, parallel or sequential edge extraction, optimal edge search, active shape models, and snakes.

Further methods for dividing the image into image regions are conceivable, however, and the methods explained above serve merely as examples. In a further embodiment, image regions can overlap, i.e., there are regions that belong to more than one image region. This is described in more detail below.

FIG. 2 shows, by way of example, the division of an image into image regions which results from the application of a quadtree (H. Samet, “The Quadtree and Related Hierarchical Data Structures,” 1984).
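
Purely by way of illustration, the simplest variant of step 110, a division into equally sized image regions, can be sketched in Python as follows; the identifiers and the tile size are chosen freely here and are not prescribed by the method:

```python
import numpy as np

def split_into_tiles(image, tile_size):
    """Divide a 2-D image into non-overlapping square image regions.

    Returns a list of ((row, col) origin, region) pairs. For simplicity it is
    assumed that the image dimensions are multiples of the tile size.
    """
    regions = []
    for r in range(0, image.shape[0], tile_size):
        for c in range(0, image.shape[1], tile_size):
            regions.append(((r, c), image[r:r + tile_size, c:c + tile_size]))
    return regions

# Example: an 8x8 image divided into four 4x4 image regions (cf. FIGS. 4A/4B).
img = np.arange(64).reshape(8, 8)
regions = split_into_tiles(img, 4)  # 4 regions with origins (0,0), (0,4), (4,0), (4,4)
```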

Subsequently, the image regions are grouped in step 120. In doing so, at least one of the image regions is assigned to a group. It is also possible for image regions that are not assigned to any group to be left over. This grouping is also known as classifying, classification, or clustering. No image region can be assigned to more than one group, so that each image region is assigned either to exactly one group or to no group.

The image regions are grouped here on the basis of one or more of their properties, for example, so that image regions in which these properties are identical or similar are assigned to the same group. The measure of similarity can be varied in order to obtain more or fewer groups. This standard classifying technique is known to the person skilled in the art and is therefore not explained further here.

One or more property criteria of the image regions can be used for grouping the image regions. Examples of such property criteria are the intensity of the image regions, the orientation of the intensity gradient within the image regions, the colors or color values of the image regions, the structures within the image regions, and the edge orientation; in the case of differently sized image regions, the size of the image regions can also be used for grouping. Further properties are also conceivable here.

The methods used for classification can be numerical or non-numerical, statistical or distribution-free, supervised or unsupervised, of fixed dimension or learning, and parametric or non-parametric methods.

Exemplary so-called clustering or classification methods are, for example, k-means, support vector machines (SVMs), or Gaussian mixture models, the latter also multivariate. Further classification methods are known to the person skilled in the art, however, and any suitable algorithm that clusters the image regions can be used.
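
As an illustrative sketch only, the grouping of step 120 could be realized with k-means applied to the regions from the sketch above; the property vector used here (mean intensity and mean gradient magnitudes) is one freely chosen example among the criteria listed above:

```python
import numpy as np
from sklearn.cluster import KMeans

def group_regions(regions, n_groups):
    """Assign each image region to exactly one group via k-means clustering.

    Each region is described by a small property vector: its mean intensity
    and the mean magnitudes of its vertical and horizontal gradients.
    """
    features = []
    for _, region in regions:
        gy, gx = np.gradient(region.astype(float))
        features.append([region.mean(), np.abs(gy).mean(), np.abs(gx).mean()])
    return KMeans(n_clusters=n_groups, n_init=10).fit_predict(np.array(features))

labels = group_regions(regions, n_groups=2)  # one group label per image region
```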

In the following step 130, at least one kernel in a layer of a neural network is applied to the image regions of at least one group, wherein at most one kernel is applied to the image regions of each of the groups.

FIG. 3A shows the application of kernels to images to be processed according to the prior art. In this case, each kernel is applied region by region to the entire image.

By way of explanation, an image 310, which is 8×8 pixels in size, is assumed as an example, with the following pixel values:

11 5 8 3 6 2 5 1
6 15 4 3 6 1 15 9
9 10 4 2 7 3 11 10
3 5 5 4 7 2 4 5
1 2 6 7 8 1 7 6
10 1 10 8 9 4 2 4
11 2 1 11 2 4 1 3
3 1 2 3 1 7 8 9

The kernels are 3×3 in size. After each calculation step, the kernel is moved to the right by one column. When the end of a row is reached, the kernel is applied to the next row. The size of the displacement corresponds in each case to the stride parameter and can also be different from 1. With a stride parameter of 1, in the case of two kernels 320 that are 3×3 in size, two result matrices 330 that are 6×6 pixels in size are produced.

The individual values of the image 310 here represent the values of the image dots, i.e., pixels, of the image. Depending on the bit depth of the image, the values comprise different value ranges. If the image 310 is in 8-bit grayscale, for example, the pixel values range from 0 to 255.

Each of the fields of the result matrices 330, each of which corresponds to one pixel of the image, is calculated, for example, as the sum of the products of the values of the image 310 with the respective corresponding value of the kernel 320 in a so-called cross-correlation.

In the example shown in FIG. 3A, each kernel starts at the beginning above the region in the top left corner of the image 310. This results in the following values of Table 1 for the result matrices:

TABLE 1

Image value | Kernel value 1 | Kernel value 2 | Image value × Kernel value 1 | Image value × Kernel value 2
11 | 1 | 2 | 11 | 22
5 | 2 | 2 | 10 | 10
8 | 2 | 2 | 16 | 16
6 | 1 | 0 | 6 | 0
15 | 1 | 1 | 15 | 15
4 | 1 | 0 | 4 | 0
9 | 1 | 0 | 9 | 0
10 | 0 | 1 | 0 | 10
4 | 1 | 1 | 4 | 4

The sum of the products is then entered into the result matrices 330; in the above case, this sum is 75 for kernel 1 and 77 for kernel 2.

The kernels are then shifted to the right by one column according to the stride parameter having the value 1, which results in the following Table 2:

TABLE 2

Image value | Kernel value 1 | Kernel value 2 | Image value × Kernel value 1 | Image value × Kernel value 2
5 | 1 | 2 | 5 | 10
8 | 2 | 2 | 16 | 16
3 | 2 | 2 | 6 | 6
15 | 1 | 0 | 15 | 0
4 | 1 | 1 | 4 | 4
3 | 1 | 0 | 3 | 0
10 | 1 | 0 | 10 | 0
4 | 0 | 1 | 0 | 4
2 | 1 | 1 | 2 | 2

Consequently, the values 61 and 42 would be entered into the next fields of the result matrices 330.
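
The two 3×3 kernels 320 are not listed explicitly above, but their values follow from the kernel-value columns of Tables 1 and 2. The following Python sketch, which is illustrative only, reproduces the sums 75 and 77 as well as 61 and 42 with these reconstructed kernels:

```python
import numpy as np

# Input image 310 and the two 3x3 kernels 320 implied by Tables 1 and 2.
image = np.array([
    [11,  5,  8,  3,  6,  2,  5,  1],
    [ 6, 15,  4,  3,  6,  1, 15,  9],
    [ 9, 10,  4,  2,  7,  3, 11, 10],
    [ 3,  5,  5,  4,  7,  2,  4,  5],
    [ 1,  2,  6,  7,  8,  1,  7,  6],
    [10,  1, 10,  8,  9,  4,  2,  4],
    [11,  2,  1, 11,  2,  4,  1,  3],
    [ 3,  1,  2,  3,  1,  7,  8,  9],
])
kernel_1 = np.array([[1, 2, 2], [1, 1, 1], [1, 0, 1]])
kernel_2 = np.array([[2, 2, 2], [0, 1, 0], [0, 1, 1]])

def cross_correlate(img, kernel, stride=1):
    """Slide the kernel over the image without zero padding and sum the
    products of overlapping values, as in the prior-art method of FIG. 3A."""
    kh, kw = kernel.shape
    rows = (img.shape[0] - kh) // stride + 1
    cols = (img.shape[1] - kw) // stride + 1
    out = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            window = img[r * stride:r * stride + kh, c * stride:c * stride + kw]
            out[r, c] = int(np.sum(window * kernel))
    return out

result_1 = cross_correlate(image, kernel_1)  # 6x6; result_1[0, 0] == 75, result_1[0, 1] == 61
result_2 = cross_correlate(image, kernel_2)  # 6x6; result_2[0, 0] == 77, result_2[0, 1] == 42
```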

The example shown in FIG. 3A shows an image without so-called zero padding. Zero padding means that the edges of the image are surrounded by zero values. When a kernel with k columns and rows is applied, the dimensions of the image are reduced by k−1.

The dimensions of the result matrix are reduced in the conventional method, without zero padding and with a stride parameter of 1, as follows:

Result matrix_(x)=Input image_(x)−(Kernel_(x)−1), or

Result matrix_(y)=Input image_(y)−(Kernel_(y)−1).

Result matrix_(x/y), input image_(x/y), and kernel_(x/y) respectively denote the x or y dimension of the result matrix, input image, and kernel.

If the number of columns and rows is increased by k−1 in each case before the kernels are applied, the result matrices in turn have the same size as the input image. However, even with zero padding, the method shown does not change, so that a person skilled in the art can also easily apply said method to an image with zero padding. This is shown in FIG. 3B. Due to the additional values at the edge of the image, the image increases to 10×10 pixels, and the result matrices consequently increase to 8×8 values, which in turn corresponds to the size of the input image.
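
Continuing the sketch above, zero padding can be illustrated as follows; with a 3×3 kernel, k−1 = 2 additional rows and columns, i.e., one on each side, restore the original output size:

```python
# Zero padding: one row/column of zeros on each side (k - 1 = 2 in total).
padded = np.pad(image, pad_width=1, mode='constant', constant_values=0)  # 8x8 -> 10x10
result_padded = cross_correlate(padded, kernel_1)  # 8x8, same size as the input image
```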

FIGS. 4A and 4B show the cross-correlation for the inventive method, wherein only one kernel is applied in each case to the divided image regions that are each assigned to a group. FIG. 4A continues into FIG. 4B at the transition point of the drawings marked with IV.

For example, an image 410 having the same values as before is assumed as the input image:

11 5 8 3 6 2 5 1
6 15 4 3 6 1 15 9
9 10 4 2 7 3 11 10
3 5 5 4 7 2 4 5
1 2 6 7 8 1 7 6
10 1 10 8 9 4 2 4
11 2 1 11 2 4 1 3
3 1 2 3 1 7 8 9

In FIGS. 4A and 4B, the image regions are the same size, i.e., the original image 410 having 8×8 pixels is divided here as an example into four regions, each comprising 4×4 pixels. The dividing of the image 410 into image regions can be seen in image 415. For the example, these four regions look as follows:

Top left:
11 5 8 3
6 15 4 3
9 10 4 2
3 5 5 4

Top right:
6 2 5 1
6 1 15 9
7 3 11 10
7 2 4 5

Bottom left:
1 2 6 7
10 1 10 8
11 2 1 11
3 1 2 3

Bottom right:
8 1 7 6
9 4 2 4
2 4 1 3
1 7 8 9

By way of example, the four regions in FIGS. 4A and 4B are divided into two groups, wherein the top left and the bottom right region are assigned to group 1 and the remaining regions are assigned to group 2.

The individual values of the image 410 in turn represent the values of the image dots, i.e., pixels, of the image. Depending on the bit depth of the image, the values comprise different value ranges. If the image 410 is, for example, in 8-bit grayscale, the pixel values range from 0 to 255. In addition, the image 410 can represent only one channel of an overall image, depending on the color model of the overall image. Thus, image 410 can, for example, only represent one of the red, green, or blue color channels if the overall image is in the RGB color model. In a similar way, and easily recognized by the person skilled in the art, the present method is however also suitable for every other color space, e.g., CMYK, YCbCr, YUV, HSV. Depending on the color space, the respective value of an image dot represents the intensity of the color channel or the brightness or saturation.

Since, with a stride parameter of 1 to be used within the respective image regions, a 3×3 kernel 420 can be placed in a 4×4 image region exactly four times, four values result in the result matrix for each image region. The stride parameter can also have a different value.

The following values are used by way of example for the kernels:

Kernel 1:
1 2 2
1 1 2
1 1 2

Kernel 2:
1 1 1
1 1 1
2 2 2

In this case, the individual image regions are individually correlated with the respective kernel, and the entire result matrix is subsequently reassembled from the individual results.

For the example shown in FIGS. 4A and 4B, corresponding values result for the four image regions and represent partial results of the entire result matrix 430. Since, in comparison to the conventional method described in FIGS. 3A and 3B, the regions do not overlap, a result matrix 430 of 4×4 pixels results. In the proposed method, the reduction in dimensions in relation to the input image can be calculated relatively easily for image regions of the same size and a stride parameter of 1 as follows:

Result matrix_(x)=Input image_(x)−(Kernel_(x)−1)×Regions_(x), or

Result matrix_(y)=Input image_(y)−(Kernel_(y)−1)×Regions_(y).

Result matrix_(x/y), input image_(x/y), and kernel_(x/y) respectively denote the x or y dimension of the result matrix, input image, and kernel. Regions_(x/y) denotes the number of image regions in the x or y direction.

The following applies: Total number of image regions = Regions_(x) × Regions_(y).
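
Reusing the cross_correlate function from the sketch above, the application of at most one kernel per group can be illustrated as follows. The assignment of kernel 1 to group 1 and kernel 2 to group 2 is an assumption made for this illustration only:

```python
# Kernels 420 from the example above, one per cluster group (assumed mapping).
kernels = {
    1: np.array([[1, 2, 2], [1, 1, 2], [1, 1, 2]]),  # kernel 1 for group 1
    2: np.array([[1, 1, 1], [1, 1, 1], [2, 2, 2]]),  # kernel 2 for group 2
}
# Region origins of image 410 and their group labels: top left and bottom
# right belong to group 1, the remaining regions to group 2.
region_groups = {(0, 0): 1, (0, 4): 2, (4, 0): 2, (4, 4): 1}

def grouped_correlation(img, region_groups, kernels, region_size=4):
    """Apply at most one kernel to the image regions of each group and
    reassemble the partial results into the entire result matrix 430.

    Each 4x4 region with a 3x3 kernel and stride 1 yields a 2x2 partial
    result, so the four regions yield one 4x4 result matrix.
    """
    part_size = region_size - 2  # output size per region for a 3x3 kernel
    n = img.shape[0] // region_size
    out = np.zeros((n * part_size, n * part_size), dtype=int)
    for (r, c), group in region_groups.items():
        part = cross_correlate(img[r:r + region_size, c:c + region_size], kernels[group])
        pr, pc = (r // region_size) * part_size, (c // region_size) * part_size
        out[pr:pr + part_size, pc:pc + part_size] = part
    return out

result_430 = grouped_correlation(image, region_groups, kernels)  # one 4x4 matrix
```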

With differently sized image regions, the calculation is somewhat more complex, but the size of the input image can be achieved with zero padding of the regions or by targeted overlap, i.e., padding with neighboring regions.

The calculation of the partial results from an image region and the associated kernel in each case can be carried out successively or simultaneously, i.e., in parallel. It is irrelevant whether the image regions are of equal size or of different sizes.

In order to obtain a result matrix of the same size as the input image, the individual image regions can be enlarged accordingly in the proposed method using the zero-padding method or by padding with the neighboring pixels.

When image regions overlap, the entire result matrix is compiled by forming, for each pixel for which more than one value exists, the average of all available values for that pixel.
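
A minimal sketch of this averaging when compiling the entire result matrix, again with freely chosen identifiers, could look as follows:

```python
def assemble_with_overlap(partial_results, shape):
    """Compile the entire result matrix from possibly overlapping partial
    results; where more than one value exists for a pixel, the average of
    all available values is formed.

    `partial_results` is a list of ((row, col) origin, 2-D array) pairs.
    """
    total = np.zeros(shape, dtype=float)
    count = np.zeros(shape, dtype=float)
    for (r, c), part in partial_results:
        total[r:r + part.shape[0], c:c + part.shape[1]] += part
        count[r:r + part.shape[0], c:c + part.shape[1]] += 1
    return total / np.maximum(count, 1)  # avoid division by zero for untouched pixels
```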

The advantage becomes obvious in the case of correspondingly larger image dimensions. For example, in the case of an input image that is 128×128 pixels in size, 16 result matrices that are 128×128 in size are produced in the conventional method with 16 filters, assuming corresponding padding. The overall data volume is thus 128×128×16.

By using the proposed method, however, irrespective of the number of kernels, only one matrix that is 128×128 in size is always produced, again assuming corresponding padding, which means a data reduction by a factor of 16. In this way, with the same number of kernels, hardware resources can be saved; in fact, the data volume of the result matrix is identical to the data volume of the input image.

Moreover, since each kernel is applied to each pixel in the conventional method, the proposed method results in a reduction of the computing effort.

Furthermore, the proposed method can also have an advantageous effect on the image quality, since image regions of the same group are processed in the same way, while image regions of different groups are processed differently. The structures of the input image can thus be better preserved.

This is an advantage not only for denoising but also for other tasks, such as deblurring or inpainting. Furthermore, classification tasks can be accelerated by, for example, not processing homogeneous image regions, since frequently only the edges are of interest.

The application of the proposed method to denoising in fluorescence microscopy is explained below as a further application. In this case, primarily the hot pixels and the dark noise are to be reduced since, according to the EMVA 1288 standard, they depend directly on the exposure time.

It is assumed here that the images to be corrected are recorded by a digital camera mounted on a fluorescence microscope.

In this case, the clustering algorithm is trained on the basis of training examples so that it can classify the image regions created according to the selected properties. The image regions created can thus subsequently be processed differently by the network according to their label, i.e., the group to which they belong.

The input image here is, for example, a grayscale image or color image for which the noise is to be reduced. The individual image regions are grouped according to the clustering algorithm. Optionally, there is a list of image coordinates of the individual image regions. The input image is now processed by the proposed application of kernels 130 to the image regions. This results in small structures and edges.

Furthermore, before, during, or after this application of kernels 130 to the image regions, the input image is processed, for example, by a pooling layer, preferably by a median or min pooling layer, as a result of which the number of hot pixels is reduced. The proposed application of kernels can likewise be applied to the result image of said processing. In order to preserve the original image dimensions, either the input image or the result image can be processed by a corresponding upsampling layer. A detailed description of the layers can be found, for example, in the documentation of the Keras library.

The result images both from the actual application of kernels and from the pooling layer are subsequently combined in a merge layer. This can be carried out, for example, by means of an add layer or an average layer. Finally, further instances of the proposed application of kernels can be applied.
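
Purely as an illustrative sketch, such an architecture could be assembled with the Keras API roughly as follows. A standard Conv2D layer stands in for the proposed grouped-kernel layer, which would be a custom layer in practice, and min pooling is expressed via max pooling of the negated input, since Keras provides no built-in min or median pooling layer:

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = layers.Input(shape=(128, 128, 1))

# Branch 1: stand-in for the proposed application of kernels 130 to the
# grouped image regions (a custom layer in practice).
branch_1 = layers.Conv2D(1, 3, padding='same', activation='relu')(inputs)

# Branch 2: min pooling to reduce hot pixels, followed by upsampling to
# restore the original image dimensions.
negated = layers.Lambda(lambda t: -t)(inputs)
max_of_negated = layers.MaxPooling2D(pool_size=2)(negated)
min_pooled = layers.Lambda(lambda t: -t)(max_of_negated)
branch_2 = layers.UpSampling2D(size=2)(min_pooled)

# Merge layer: here an average layer; an add layer would work analogously.
merged = layers.Average()([branch_1, branch_2])
outputs = layers.Conv2D(1, 3, padding='same')(merged)  # further kernel application

model = tf.keras.Model(inputs, outputs)
```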

The network can be trained by a corresponding data set. This data set can, for example, consist of image pairs, each comprising a noisy image and an associated noise-free image.

The network architecture can be trained in a so-called end-to-end method.

In the inference of the network, it is conceivable that image regions are assigned to a particular cluster group that was not present in the training data. In that case, the filters or kernels of the proposed layer assigned to the corresponding cluster group have not been trained.

FIG. 6 shows a method 200 for training a neural network. The network is trained here on at least one image, a so-called test pattern. The test pattern contains the desired feature attributes, e.g., different edge orientations, colors, or intensities. An exemplary test pattern is shown in FIG. 5.

In this case, in step 210, an image divided into at least two image regions is read first. The division can be established beforehand or can be carried out in the layer of the neural network in which the kernel is also subsequently trained, in another layer of the neural network, or in another neural network.

This is followed in a step 220 by a grouping of at least one of the image regions into at least one group, wherein each of the image regions is assigned to exactly one group or to no group.

In this case, the grouping can also be established beforehand or can be carried out in the layer of the neural network in which the kernel is also subsequently trained, in another layer of the neural network, or in another neural network.

Subsequently, in a step 230, at least one kernel 420 is applied in a layer of the neural network or in one of the neural networks to the image regions of at least one group, wherein at most one kernel is applied to the image regions of each group.

Training is explained in more detail below with reference to an example, denoising.

By using a test pattern, the clustering algorithm can learn the desired cluster groups accordingly. In addition, it is ensured that the kernels are trained for all cluster groups. In order for the network to be able to learn a denoising method, for example, the test patterns are superimposed with corresponding noise. Training with test patterns additionally avoids overfitting the weights of the network to, for example, certain types of samples in the training data.

The network thus learns denoising, for example, for different edge orientations but not for certain sample types. If the network is trained, for example, on edge orientations, significantly less training data is required for this purpose than if the network were to be trained directly on the microscopic samples.

For the training, using the example of denoising, a noisy input image that has already been divided into image regions, which have also already been clustered, is fed to the neural network. In order to determine the success during training, the corresponding noise-free image is provided. The neural network is then trained by the known method of back propagation. However, since, according to the inventive method, each kernel is in each case only applied, i.e., filtered, to specific image regions in the image according to the clustered groups, the method has the advantage that the training can be carried out with only one test pattern. Several image pairs consisting of noisy and noise-free images can also be used but, in contrast to conventional training methods, one image pair is sufficient.

The input for the training is the noisy test pattern having image regions that have already been grouped. The so-called ground truth, i.e., the ideal result image, is the noise-free test pattern. Each filter, i.e., kernel, is adapted by the back-propagation method such that, on average, it best denoises all image regions to which it is applied.

Because the image regions of one cluster group respectively all have the same properties, e.g., the same edge orientations or the like as explained above, this filter learns to denoise the correspondingly labeled image regions with respect to this property. Of course, this also applies to the other filters.
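
Under the assumptions of the Keras sketch above, such a training run on a single image pair could be outlined as follows; the random pattern merely stands in for the test pattern of FIG. 5, and the noise model (photon shot noise plus read noise) follows the camera model discussed in the background section:

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(1, 128, 128, 1)).astype('float32')  # stand-in test pattern

# Superimpose noise: photon shot noise (Poisson) plus electrical read noise.
photons = rng.poisson(clean * 100.0).astype('float32') / 100.0
noisy = photons + rng.normal(0.0, 0.02, size=clean.shape).astype('float32')

# Back propagation on a single noisy/noise-free image pair.
model.compile(optimizer='adam', loss='mse')
model.fit(noisy, clean, epochs=100, batch_size=1)
```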

It is also possible to extend an already existing neural network by the convolutional layer according to the present disclosure or to replace an already existing layer in a neural network by the convolutional layer according to the disclosure.

In this case, it is also possible to train only the layer according to the present disclosure. Usually, however, the network is trained as a whole, and not just a single layer in a network.

If the dividing of the image and the grouping of the image regions take place outside the network, corresponding cluster methods naturally need to be trained beforehand, if they are used.

Due to different settings of the camera gain, for example, and of the exposure time, images with varying degrees of noise are produced. It is therefore conceivable that, after processing using the denoising method, images with a high degree of noise, for example, do not achieve the same image quality as images with a lower degree of noise. Therefore, if information about the set camera gain and the exposure time is available, a correspondingly different processing of the input image can be carried out.

Furthermore, any input dimensions are possible after the network has been trained. In the case of a method according to the prior art, however, the input dimensions for the classical convolutional layer must be defined at the beginning of the training.

Further processing steps 125 can optionally be performed before the above-discussed application of at least one kernel 130 to the groups of image regions. These further processing steps 125 can comprise conventional convolution, i.e., applying kernels according to the prior art, or different processing, as already explained above, such as pooling, upsampling, etc.

However, the steps 125 can also comprise applying at least one kernel 130 according to the exemplary embodiment described above. In this case, the dividing of the image into the image regions can be identical to step 130 or different.

After the above-discussed application of at least one kernel 130 to the groups of image regions, further processing steps 135 can optionally be performed. These further processing steps 135 can comprise conventional convolution, which is known to the person skilled in the art, but steps 135 can also include applying at least one kernel 130 according to the exemplary embodiment described above. In this case, the dividing of the image into the image regions can be identical to step 130 or different.

The above-discussed steps of dividing 110, grouping 120, and the possible further processing steps 125 and/or 135 can be carried out in this case in the layer of the neural network in which the at least one kernel is applied 130. However, it is also conceivable that some or all of these steps are carried out in another layer of the same neural network or in another neural network. In addition, some or all of these steps may also be performed outside of neural networks. Furthermore, the further steps 125, 135 may also be performed at least partially in parallel to the application of the at least one kernel 130, so that the further processing steps 125, 135 may be performed before, during, or after the application of the at least one kernel 130.

Similarly, the same applies to the training method 200, in which the steps of reading 210 and grouping 220 can be performed in the layer of the neural network in which at least one kernel 230 is applied, in another layer of the neural network, in another neural network, or outside of neural networks.

Further examples of processing steps which can be carried out within the same layer are, for example, the application of an activation function, e.g., ReLU or softmax, the addition of a bias value, or batch normalization. Outside of the layer, in another layer or another network, any other processing by means of layers can then follow, such as pooling, upsampling, or fully connected, i.e., dense, layers.

If the steps of dividing and grouping in steps 125 and/or 135 are different from step 130, these differences may consist in that another algorithm is used for dividing and/or grouping and/or in that other parameters are used for the dividing and/or grouping.

However, steps 125 and/or 135 can also include other processing steps of the image, e.g., preparatory measures, such as cropping, color reduction, a change in contrast, or the like. In addition, it is also conceivable that further processing steps are carried out in parallel to the application of the kernel 130. The results, which are achieved separately, can subsequently be combined or further processed separately.

Furthermore, the processed image can be output in a subsequent step 140. The output of the image does not differ from the output of images as known to the person skilled in the art.

FIG. 7 shows a device 500 for processing an image in a neural network. This comprises at least one memory 510, wherein at least one image is stored in at least one of the memories. Furthermore, the device 500 comprises at least one processor 520, wherein at least one of the processors is configured to perform, as described above, a method for processing an image in a neural network or a method for training a layer of a neural network.

Additionally described is a computer-readable medium which comprises commands that, when executed by a processor, cause the processor to perform the steps of a method for processing an image in a neural network or of a method for training a layer of a neural network as described above.

FIG. 8 shows, by way of example and schematically, the process of the entire denoising method. Here, the input image 810 is initially subdivided into image regions 820. Subsequently, the image regions are divided by a previously trained clustering algorithm into the cluster groups 1 to 100. This results in the labeled input image 830, here with the groups 1 to 8. The noise-reduced result image 850 is then created by the inventive method in a neural network 840. In this example, the steps of dividing and grouping take place outside the convolutional layer of the neural network.

It is conceivable that these steps are performed beforehand in another layer of the same neural network, by another neural network, or outside of neural networks.

LIST OF REFERENCE SIGNS

100 Method for processing an image

110 Step: Dividing

120 Step: Grouping

125 Further steps

130 Step: Applying at least one kernel

135 Further steps

140 Step: Outputting

200 Method for training a neural network

210 Step: Reading an image

220 Step: Grouping

230 Step: Applying at least one kernel

310 Input image (prior art)

320 Kernel (prior art)

330 Result matrices (prior art)

410 Input image

415 Input image divided into grouped image regions

420 Kernels

430 Result matrices

500 Device

510 Memory

520 Processor

810 Input image

820 Image regions

830 Input image divided into grouped image regions

840 Neural network

850 Result image

The various embodiments described above can be combined to provide further embodiments. All of the non-patent publications referred to in this specification are incorporated herein by reference, in their entirety. Aspects of the embodiments can be modified, if necessary, to employ concepts of the various publications to provide yet further embodiments.

These and other changes can be made to the embodiments in light of the above detailed description. In general, in the following claims, the terms used should not be construed to limit the claims to the specific embodiments disclosed in the specification and the claims, but should be construed to include all possible embodiments along with the full scope of equivalents to which such claims are entitled.

1. A computer-implemented method for processing an image, comprising: dividing the image into at least two image regions; grouping at least one of the image regions into at least one group, wherein each of the image regions is assigned to exactly one group or to no group; and applying at least one kernel in a layer of a neural network to the image regions of at least one group, wherein at most one kernel is applied to the image regions of each group.
2. The method according to claim 1, wherein the kernel is a convolution kernel.
3. The method according to claim 1, wherein before, during, or after applying the at least one kernel, further processing steps of the image are performed; and/or wherein one or more of the further steps and of the steps of dividing and grouping are performed in the layer of the neural network, in another layer of the neural network, in another neural network, or outside of neural networks.
4. The method according to claim 1, further comprising outputting the processed image after convolution.
5. The method according to claim 1, wherein the dividing the image into at least two image regions divides the image into image regions of equal size.
6. The method according to claim 1, wherein the dividing the image into at least two image regions divides the image into differently sized image regions; and/or wherein the dividing the image into at least two image regions is carried out by decomposition.
7. The method according to claim 1, wherein the grouping groups the image regions with the same properties into one of the groups in each case.
8. The method according to claim 7, wherein the same properties of the image regions correspond to at least one of the following criteria: orientation of the intensity gradient; colors of the image regions; intensities of the image regions; structures within the image regions; edge orientation; and/or, in the case of differently sized image regions, image regions of the same size.
9. The method according to claim 1, wherein the grouping groups the image regions by means of clustering or classification algorithms.
10. A computer-implemented method for training a neural network, comprising: reading an image that is divided into at least two image regions; grouping at least one of the image regions into at least one group, wherein each of the image regions is assigned to exactly one group or to no group; and applying at least one kernel in a layer of a neural network to the image regions of at least one group, wherein at most one kernel is applied to the image regions of each group.
11. A device for processing an image in a neural network, comprising: at least one memory, wherein at least one image is stored in the at least one memory; and at least one processor, wherein the at least one processor is configured to: divide the image into at least two image regions; group at least one of the image regions into at least one group, wherein each of the image regions is assigned to exactly one group or to no group; and apply at least one kernel in a layer of a neural network to the image regions of at least one group, wherein at most one kernel is applied to the image regions of each group.
12. The device according to claim 11, wherein the kernel is a convolution kernel.
13. The device according to claim 11, wherein before, during, or after applying the at least one kernel, the processor is further configured to perform further processing steps of the image; and wherein one or more of the further steps and of the steps of dividing and grouping are performed in the layer of the neural network, in another layer of the neural network, in another neural network, or outside of neural networks.
14. The device according to claim 11, wherein the processor is further configured to output the processed image after convolution.
15. The device according to claim 11, wherein the dividing the image into at least two image regions divides the image into image regions of equal size.
16. The device according to claim 11, wherein the dividing the image into at least two image regions divides the image into differently sized image regions; and/or wherein the dividing the image into at least two image regions is carried out by decomposition.
17. The device according to claim 11, wherein the grouping groups the image regions with the same properties into one of the groups in each case.
18. The device according to claim 17, wherein the same properties of the image regions correspond to at least one of the following criteria: orientation of the intensity gradient; colors of the image regions; intensities of the image regions; structures within the image regions; edge orientation; and/or, in the case of differently sized image regions, image regions of the same size.
19. The device according to claim 11, wherein the grouping groups the image regions by means of clustering or classification algorithms.
20. A computer-readable medium comprising commands which, when executed by a processor, cause the processor to perform the steps of the method according to claim 1.